Self-efficacy measurement instruments for individuals with coronary artery disease: A systematic review

Introduction Over the past decade, there has been heightened interest in evaluating self-efficacy among patients with coronary artery disease (CAD). A significant number of instruments have been developed and validated, yet the quality of the underlying studies and the measurement properties of the instruments remain to be assessed. Objectives To evaluate the measurement properties of self-efficacy instruments for individuals with CAD and to link the content extracted from their items to the International Classification of Functioning, Disability, and Health (ICF). Methodology The study was conducted following the Cochrane systematic review guidelines and the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN), and was registered under CRD42021262613. The search was carried out in MEDLINE (Ovid), Web of Science, EMBASE, and PsycINFO, and included studies involving the development and validation of self-efficacy instruments for individuals with CAD, without language or date restrictions. Data extraction was performed in May 2022 and updated in January 2023. All steps of this review were carried out by two different collaborators and reviewed by a third when there were divergences. The modified Grading of Recommendations, Assessment, Development and Evaluation (GRADE) approach recommended by COSMIN was used to determine the quality of evidence as high, moderate, low, or very low. Instruments were categorized per COSMIN recommendations, according to the construct of interest and study population, into three categories (A, B, or C). Results A total of 21 studies covering 12 instruments were identified. The best-rated instruments received a recommendation of B, which means that additional validation studies are needed.
The Barnason Efficacy Expectation Scale (BEES) showed high-quality evidence for structural, construct, and criterion validity and for internal consistency; the Cardiac Self-Efficacy Scale (CSES) showed high-quality evidence for content, structural, and cross-cultural validity and for internal consistency; the Self-efficacy for Appropriate Medication Use (SEAMS) showed high-quality evidence for structural and criterion validity and for internal consistency; and the Cardiovascular Management Self-Efficacy Scale showed high-quality evidence for structural, criterion, and construct validity and for internal consistency. The CSES showed content linkage with all domains of the ICF, as well as the highest number of linkages with the categories. Conclusions Instruments with a B-level recommendation hold potential for use. More studies assessing measurement properties are needed to reinforce or improve these recommendations. The CSES stands out as the most comprehensive instrument concerning the ICF.


Introduction
Self-efficacy is related to individuals' confidence in their ability to gather the cognitive, motivational, emotional, and behavioral resources necessary to achieve a goal, deal with a specific situation, or perform a task [1]. It also encompasses elements of motivation, planning, organization, and awareness of the skills necessary for dealing with illnesses, reflecting a sense of self-responsibility throughout pathological processes [1,2]. Consequently, it becomes a crucial factor for health promotion and management of chronic conditions such as coronary artery disease (CAD) [1][2][3]. Moreover, it is linked to improved quality of life, mental well-being, and enhanced adherence to rehabilitation processes [4,5].
Several self-efficacy measurement instruments are available in the literature: general instruments, whether or not related to specific disease conditions [6]; instruments that assess specific individual conditions or behaviors, with or without associated diseases (e.g., eating behavior, physical activity, and medication adherence) [7][8][9]; and instruments that evaluate individuals in relation to their diseases (e.g., asthma, stroke, and CAD) [10][11][12][13]. While these instruments are well established, their methodological rigor needs to be assessed to guide the selection of the most suitable instrument for clinical practice and cardiovascular rehabilitation.
CAD remains one of the leading causes of mortality and morbidity worldwide [14]. It is estimated that by 2030, the current prevalence rate of 1,655 per 100,000 population will exceed 1,845 [14]. CAD is characterized by the onset of symptoms such as chest pain, dyspnea, and a sensation of pressure or tightness at varying levels of exertion. Conventional treatment involves the individual's adherence to cardiovascular rehabilitation programs and lifestyle changes, aimed at delaying and preventing future complications, as well as improving physical fitness through resistance and aerobic training [15].
Since the level of self-efficacy can influence adherence to rehabilitation [16], instruments that assess self-efficacy within this population are needed to enhance adherence to treatment and rehabilitation programs [1][2][3]. In this context, evaluating self-efficacy instruments for individuals with CAD may contribute to understanding the available tools for clinical practice and research, as well as assisting healthcare professionals in individual interventions.
Therefore, this systematic review aims to evaluate the clinimetric properties of self-efficacy instrument items for people with CAD and to relate their content to the International Classification of Functioning, Disability and Health (ICF) [17].

Materials and methods
The systematic review was conducted following the Cochrane systematic review guidelines [18] and the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) guidelines [19][20][21]. The protocol was registered in the International Prospective Register of Systematic Reviews (PROSPERO) under the registration number CRD42021262613 [22].

Search strategy
The search was conducted in May 2022 in the MEDLINE (Ovid), Web of Science, EMBASE, and PsycINFO databases, considering: (1) the construct of interest (cardiac self-efficacy); (2) the target population (individuals with CAD); (3) the type of instrument (questionnaire or scale); and (4) measurement properties. The latter was addressed using validated search filters for measurement studies, applied in previous reviews and recommended by COSMIN [23].
Additional searches for relevant studies were conducted manually by checking the reference lists of primary studies and review articles. Searches were repeated prior to the final analysis in January 2023 using a date filter so that we could find studies published after our first search. The search strategies are provided in S1 File.

Study selection
Studies that developed and validated clinimetric properties of self-efficacy measurement instruments for individuals with CAD were included, with no restrictions on publication date or language.
Clinical trials or validation studies using measures reported by third parties, theses, dissertations, and those published as abstracts were excluded. Additionally, studies of instruments that had self-efficacy as part of their construct (e.g., self-management, self-care, self-control) were also excluded, limiting the scope to self-efficacy instruments for patients with CAD or with participant reports of the same diagnosis in the study.
The search results were imported into the reference management tool Mendeley (https://www.mendeley.com). Duplicates were removed prior to the selection process, and the reference list was exported to the systematic review platform Rayyan Qatar Computing Research Institute (https://rayyan.qcri.org) [24].
Two independent authors (JABA and KSM) selected the studies simultaneously based on titles and abstracts. After this step, the same authors conducted independent and simultaneous full-text readings and documented reasons for excluding ineligible studies. In cases of disagreement, a meeting was held for discussion and consultation with a third reviewer (LPG).

Data extraction
Data extraction was carried out by two authors (JABA and LPG) following COSMIN and Cochrane [18][19][20]. The extracted information included title, authors, year of publication, general instrument characteristics (construct, subscales, number of items, version, score), study design, target population, sample size, individual characteristics (e.g., age range, gender, research location, country, language, selection methods), and clinimetric properties. A third reviewer (KSM) was consulted for reviewing the extracted data.

Study quality
The methodological quality of the studies was assessed by two independent authors (RBF and JCL) using the COSMIN RoB Checklist [25,26]. This tool considers 10 measurement properties and consists of boxes with 3 to 35 items addressing aspects of comprehensiveness, relevance, and inclusiveness of the items included in the instrument. Each box assigns a methodological quality score for instrument development: (1) content validity, (2) structural validity, (3) internal consistency, (4) cross-cultural validity, (5) measurement invariance, (6) reliability, (7) measurement error, (8) criterion validity, (9) construct validity, and (10) responsiveness. Each item has four response options: inadequate (I), doubtful (D), adequate (A), and very good (V) [25]. Disagreements were resolved by a third author (KSM).
The content extracted from the measurement instruments was linked using the Comprehensive International Classification of Functioning, Disability and Health (ICF) Core Set following the recommendations of Cieza et al. (2016) [17][27][28][29]. The linking was conducted by two separate authors (JABA and RBF). Subsequently, a third author (JCL) reviewed the contents in case of discrepancies.

Data synthesis
Initially, a narrative synthesis of the results was prepared. In cases where the same instrument was validated for different populations, the assessment of measurement properties was performed considering a single instrument, with the particularity of each version being discussed. A combination of measurement properties determined the overall evidence of the instrument. Studies were grouped based on the similarity of instrument versions.
The results were assessed in groups or summarized in relation to the measurement property criteria to determine whether they were sufficient (+), insufficient (-), inconsistent (±), or indeterminate (?). The criteria were also subjectively evaluated by the reviewers (JABA and LPG), according to COSMIN criteria [19]. Additionally, the Modified Grading of Recommendations, Assessment, Development and Evaluation (GRADE) recommended by COSMIN was used to determine the quality of evidence as high, moderate, low, or very low [19,30].
Subsequently, the instruments were categorized and justified according to COSMIN recommendations [21], considering the construct of interest and study population, into three categories: (A) the instrument is recommended for use and the results are reliable; (B) the instrument can be recommended but requires further validation studies; and (C) the instrument should not be recommended due to insufficient properties.

Results
A total of 4420 references were identified from the EMBASE (n = 1126), MEDLINE (n = 995), PsycINFO (n = 564), and Web of Science (n = 1735) databases. After 1304 duplicates were removed, 3116 references remained for screening. Considering the eligibility criteria, 3072 studies were excluded. Of the remaining 44 studies, 28 were excluded because they were not available in full text (n = 6), had self-efficacy in only one of the instrument domains (n = 5), did not include participants with a diagnosis of CAD (n = 13), or were clinical trials (n = 3), resulting in 16 included studies. Additionally, a search of the reference lists of the included studies identified 5 further studies, for a total of 21 articles in the review (Fig 1).
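The stage-by-stage counts above can be re-derived with simple arithmetic. The following is a minimal sketch in Python that uses only the totals stated in the text; the function and variable names are hypothetical and are not part of the review's methods.

```python
# Record counts taken from the Results text; all names are illustrative.
RECORDS_PER_DATABASE = {
    "EMBASE": 1126,
    "MEDLINE": 995,
    "PsycINFO": 564,
    "Web of Science": 1735,
}


def selection_flow(per_database, duplicates, excluded_screening,
                   excluded_full_text, added_from_references):
    """Return the number of records remaining after each selection stage."""
    identified = sum(per_database.values())
    screened = identified - duplicates
    full_text = screened - excluded_screening
    included_from_search = full_text - excluded_full_text
    included_total = included_from_search + added_from_references
    return {
        "identified": identified,
        "screened": screened,
        "full_text": full_text,
        "included_from_search": included_from_search,
        "included_total": included_total,
    }


flow = selection_flow(
    RECORDS_PER_DATABASE,
    duplicates=1304,
    excluded_screening=3072,
    excluded_full_text=28,
    added_from_references=5,
)
for stage, count in flow.items():
    print(f"{stage}: {count}")
```

Keeping each stage as a named value makes it easy to compare the computed totals with those shown in a flow diagram such as Fig 1.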

Characteristics of included studies and instruments
The included studies were published from 1998 to 2021. All of them aimed to develop or validate self-efficacy instruments specific to cardiovascular diseases or general ones that included individuals with CAD in their processes. The 21 included articles correspond to 12 self-efficacy measurement instruments in different versions available in the English language. Among the 21 studies, 6 concern instrument development and 15 concern validation.
Furthermore, the CSES [12,13,35,42,45] and ESE [37,41,43] showed the highest number of versions. Therefore, the CSES was considered the instrument with the broadest dissemination and is the oldest instrument included [12]. Regarding feasibility, the studies were not clear or did not provide data on factors such as time required for completion, instrument length, intellectual level required to respond, and facilitators. Concerning completion time, only the Arabic version of ESE [41] and the Thai CSES version [45] reported needing 20 and 15 minutes, respectively. No study reported whether a license is required for their use.
Although structural, content, and internal consistency validities were assessed in most studies, criterion, construct, cross-cultural validities, reliability, measurement error, and responsiveness were evaluated in a limited number of studies, either due to a complete or partial absence of data. Regarding reliability, the lack of data can negatively affect the reproducibility of consistent results. Measurement invariance was not assessed as no study presented such a property.

Methodological quality of the studies
The data obtained from the methodological assessment of the studies are summarized in Table 2. Nine out of the ten clinimetric properties were evaluated, with the exception of measurement invariance, which was not mentioned by any study.
The general requirements for development were satisfactorily met (e.g., clear description of the construct, clear description of the target population for which the instrument is intended, and its context of use). However, the studies either did not provide or did not demonstrate clarity in at least one requirement, leading to their classification as "inadequate" (Table 2).
Regarding reliability, no study received a good methodological evaluation. All studies were considered "inadequate" in at least one of the eight assessed requirements. This is due to not reporting data on the Kappa agreement coefficient test and intraclass correlation coefficient.
The construct validity of the studies was conducted through hypothesis testing, mainly focusing on convergent and discriminant validity. Only one study reported parameters through divergent validity [44]. Nine studies were classified as "inadequate" due to lack of clarity or absence of measurement property data [8,31,32,34,36,37,40,41,46,47]. For responsiveness, only two studies provided relevant information for this property and were considered "adequate" [37,39].

Summary of quality and level of evidence
Only one instrument demonstrated low quality in its internal consistency [39]. Content validity, structural validity, cross-cultural validity, criterion validity, and construct validity showed mixed qualities. Reliability, measurement error, and responsiveness exhibited low overall quality. Table 3 provides a summary of the evaluated measurement properties.
• Bandura's Exercise Self-Efficacy Scale (ESE): assessed in three versions adapted for distinct populations. Content validity was unsatisfactory in all versions. Structural validity was deemed unsatisfactory in the Australian version [37] (with factor loadings > 0.40). Internal consistency received an inadequate assessment in the South Korean version [43] for not providing Cronbach's alpha for each domain. Cross-cultural validity and construct validity received mixed evaluations, and only the South Korean version [43] was considered three-dimensional. The overall quality of evidence for the instrument was mixed. Only internal consistency was deemed high. Structural, criterion, and construct validities were rated as moderate, and content validity was rated as low. Therefore, it was categorized as a level C recommendation.
• Barnason Efficacy Expectation Scale (BEES) [48]: Content validity was considered inconsistent. Structural validity (factor loading > 0.40), internal consistency (Cronbach's alpha 0.92), and construct validity were deemed sufficient. The latter indicated that the instrument is unidimensional. The instrument provided insufficient data for reliability, measurement error, and responsiveness. The quality of evidence received mixed ratings, with internal consistency, structural validity, criterion validity, and construct validity rated as high. However, content validity was considered low. Therefore, it was categorized as a level B recommendation.
• Cardiac Exercise Self-Efficacy Scale Persian Version (CESE) [44]: Structural validity (factor loading > 0.45) and construct validity were considered sufficient. Its exploratory analysis identified a 4-factor structure (knowledge, overcoming barriers, time management, and recovery). However, it did not provide Cronbach's alpha values for each domain, resulting in an inadequate rating. Cross-cultural validity was satisfactory. The quality of evidence for the instrument received mixed ratings, with structural validity, cross-cultural validity, and construct validity considered high, while internal consistency and content validity were rated low by the assessors. The instrument was categorized as a level C recommendation.
• Cardiac Self-Efficacy Scale (CSES): The instrument was evaluated in its original version [12] and four adaptations for different populations. Content validity was considered satisfactory in all versions. Structural validity was considered inconsistent only in the original version [12] and satisfactory in all four adaptations [13,42,35,45]. The original model [12] and the Arabic version [45] presented as bidimensional, while the other adapted versions presented as tridimensional models. The versions showed discrepancies in assessing internal consistency, as the Chinese [35] and Swedish [13] versions did not report individual Cronbach's alpha values for their three domains. Construct validity received mixed evaluations. The quality of evidence for the instrument received mixed ratings, with high content validity, structural validity, cross-cultural validity, and internal consistency. The instrument was categorized as a level B recommendation.
• Cardiovascular Management Self-Efficacy Scale [39]: Structural validity (tridimensional model with factor loadings from 0.86 to 0.95), construct validity, criterion validity, and internal consistency received satisfactory ratings, but content validity obtained an insufficient rating. The quality of evidence for the instrument received mixed ratings, with only structural validity, criterion validity, construct validity, and internal consistency receiving high evaluations. The instrument was classified as a level B recommendation.
• Cholesterol-Lowering Diet Self-Efficacy Scale [31]: Content validity was considered inconsistent, structural validity (no data provided in the study) and construct validity were rated as unsatisfactory, while internal consistency (Cronbach's alpha > 0.93) and criterion validity were rated as satisfactory. The quality of evidence for the instrument received mixed ratings, with six properties considered low, including content validity and structural validity. Therefore, the instrument was classified as level C evidence.
• Food Pyramid Self-Efficacy Scale (FPSES) [32]: The instrument received satisfactory ratings for content validity and internal consistency (Cronbach's alpha 0.92). However, structural validity (no factor analysis performed) and criterion validity were considered unsatisfactory. The quality of evidence for measurement properties received mostly low ratings, with high ratings only for content validity and internal consistency. Therefore, the instrument was classified as a level C recommendation.
• General Perceived Self-Efficacy Scale (GSE) [40]: The validated instrument received unsatisfactory ratings for content, criterion, and construct validity. It received a satisfactory rating only for internal consistency (Cronbach's alpha 0.85). Structural validity was considered inconsistent. The quality of evidence for measurement properties received mixed evaluations, but predominantly low, with high ratings only for cross-cultural validity and internal consistency. Therefore, the instrument was classified as a level C recommendation.
• Scale to Measure Self-Efficacy and Self-Management in People With Coronary Heart Disease (HH-SESM Scale) [38]: The developed instrument received inconsistent ratings for content and structural validity, with a satisfactory rating only for internal consistency (Cronbach's alpha 0.83 for the self-efficacy subscale). In terms of evidence quality, the instrument received predominantly low ratings, including content and structural validity, with a high rating only for internal consistency. Therefore, the instrument was classified as a level C recommendation.
• Self-Efficacy for Appropriate Medication Use (SEAMS): In the evaluation of the original version [33] of the instrument and the Brazilian version, found in two studies [46,47], structural validity, criterion validity, and internal consistency were considered satisfactory. Content validity was considered insufficient for both versions. The versions differed, however, in the assessment of construct validity, for which the original version [33] was evaluated as unsatisfactory. The original study has factor loadings > 0.40 and is considered four-dimensional [33], while the Brazilian version is bidimensional [46,47]. Regarding evidence quality, the grouped instrument received mixed evaluations, being rated as high only in terms of structural validity, cross-cultural validity, criterion validity, and internal consistency. Content validity was classified as low. Therefore, the instrument was classified as a level B recommendation.
• Self-Efficacy for Exercise Scale Chinese Version (SEE-C) [8]: The instrument received an unsatisfactory rating for content validity and moderate ratings for cross-cultural and construct validity (factor loading > 0.64, and considered unidimensional). Structural validity, criterion validity, and internal consistency (Cronbach's alpha > 0.90) received satisfactory evaluations. Regarding evidence quality, the instrument received mixed assessments, receiving high ratings only for structural validity, criterion validity, and internal consistency. Content validity was considered low. Therefore, the instrument was classified as a level C recommendation.
• Tai Chi Exercise Self-Efficacy Performance (TCSE): The validated instruments received satisfactory ratings in terms of internal consistency (Cronbach's alpha > 0.95) and construct validity (factor loading > 0.40, considered bidimensional). They received mixed ratings in terms of content validity (varying from insufficient for the US version [34] to unsatisfactory for the Chinese version [36]), structural validity, and cross-cultural validity. In terms of the quality assessment of aggregated evidence, the instrument received mixed ratings, with high ratings for construct validity and internal consistency. Content validity and structural validity were considered moderate. Therefore, the instrument was classified as a level C recommendation.

Linking of items from measurement instruments with the International Classification of Functioning, Disability and Health
Table 4 presents the results of linking the extracted items from the measurement instruments to the ICF. It was not possible to link the FPSES instrument [32], as it is not available in its study or on the web. We did not receive a response from the authors after email contact. Only five items could not be linked as they corresponded to personal factors and were present in the versions of the ESE instrument [37,41,43] and the HH-SESM Scale [38]. A total of 321 concepts were identified from the 276 items of the 19 instruments. Regarding the process of linking to the ICF, categories were related to the majority of the instruments (originals and their adapted versions). The component body functions (b) was linked to all 19 instruments, the component body structures (s) was linked to 6 instruments, the component activities and participation (d) was linked to 18 instruments, and finally, the component environmental factors (e) was linked to 17 instruments. The chapters with the highest number of links were the chapters on mental functions (b1) and functions of the cardiovascular, hematological and immunological systems, and the respiratory system (b4).
The instrument with the highest number of concept links to the ICF was the Cardiac Self-Efficacy Scale (CSES), although there was a slight divergence in its 5 versions included in the study, with the Arabic version [33] having 25 (the highest number of linkages), the original [12], Swedish [41], and Thai [44] versions having 23, and the Chinese version having the lowest number of linkages, totaling 21.

Discussion
This systematic review identified twenty-one studies regarding the development and/or validation of 12 instruments that assess cardiac self-efficacy, medication, exercise, cardiac surgery, rehabilitation, nutrition, lifestyle, and risk factors in subjects with CAD. Instruments were classified as evidence levels B and C. Among the included instruments, the CSES was the most disseminated as well as the instrument with the highest number of concept links to the ICF.
Judged against the COSMIN guidelines, all studies exhibited methodological shortcomings, with significant information on measurement properties missing. These findings corroborate a systematic review by Frei et al (2009) [49], which identified a large number of self-efficacy instruments for subjects with chronic diseases, all of which demonstrated significant limitations in their development and validation [49].
Considering the modified GRADE [19], no instrument was classified as evidence level A. The BEES, CSES, SEAMS, and Cardiovascular Management Self-Efficacy Scale (CMSES) were categorized as level B, suggesting they may be recommended for use when assessing the target population. A systematic review by Kavradim et al (2020) [50], focusing solely on cardiac self-efficacy instruments in the general population with cardiovascular diseases, found similar results, which corroborate our findings for the CSES and CMSES. Another review, by Lamarche, Tejpal, and Mangin (2018) [51], assessing self-efficacy instruments in medication management, found that the SEAMS instrument [33] is the most appropriate self-efficacy scale.
Structural validity provides evidence through exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) [19]. However, such data were not found for some instruments [31,32,34,38], leading to their downgraded classifications. The CSES and CMSES instruments demonstrated high quality for structural validity, consistent with the review by Kavradim et al (2020) [50]. Divergence in dimensions and item numbers was also observed between the different versions of the CSES [12,13,35,42,45] and ESE [37,41,43]. These differences may be related to cultural, educational, and socioeconomic factors of each country.
All studies measured internal consistency using Cronbach's alpha, but only a minority conducted test-retest reliability analyses. These data corroborate the findings of Frei et al (2009) [49], Kavradim et al (2020) [50], and Lamarche, Tejpal, and Mangin (2018) [51]. We recommend that validation processes include tests relevant to the instrument's purpose, and that every validation incorporate test-retest reliability analysis, preferably using intraclass correlation coefficients [25].
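For readers less familiar with the internal consistency statistic discussed above, Cronbach's alpha for a scale of k items is commonly computed as k/(k - 1) multiplied by (1 - sum of the item variances / variance of the total score). The following is a minimal sketch in Python of that standard formula. The function name and the response data are hypothetical illustrations, not data from any included study, and the sketch does not cover test-retest reliability or intraclass correlation coefficients.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for the items and the total score.
    """
    if len(scores) < 2:
        raise ValueError("at least two respondents are required")
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer the same number of items")

    items = list(zip(*scores))  # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: five respondents, four items on a 1-5 scale.
hypothetical_responses = [
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
]
demo_alpha = cronbach_alpha(hypothetical_responses)
print(f"Cronbach's alpha: {demo_alpha:.2f}")
```

For a multidimensional instrument, the same function would be applied to the items of each domain separately, which corresponds to the per-domain alpha values that several included studies did not report.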
Regarding the ICF, the CSES instrument in all its versions showed the highest number of linkages of its items with the four ICF categories and codes [50]. This may be related to the diversity of content assessed in its items, which can increase the range of codes, making it the most comprehensive instrument included in the review. As the ICF classification is a reference in clinical practice, teaching, and research language, this data becomes relevant [29,50].
All item contents that could not be linked to the ICF referred to factors not yet covered by it and were present in all ESE instrument versions [37,41,43] and the HH-SESM Scale [38]. This reinforces the reported importance of identifying these contents in linkage studies to strengthen the inclusion of additional factors in the ICF in the future [51,52].
The findings of this review show that all instruments assessing self-efficacy for individuals with CAD have some shortcomings in their measurement properties. Therefore, it is recommended to develop more robust self-efficacy instruments for individuals with CAD with fewer biases that could compromise the measured results. We believe that the results of this review can contribute to the selection of appropriate instruments for assessing self-efficacy levels in CAD in different contexts.
Moreover, considering the evidence level found in this review, the use of such instruments in clinical practice, research, and teaching should be considered carefully, as improvements in measurement properties are still needed. Health professionals, researchers, and academics should therefore choose instruments that are validated for the target population, have appropriate measurement properties, and show greater content linkage with the ICF for language standardization.
It is worth highlighting that this is the first systematic review assessing self-efficacy instruments for CAD as well as linking self-efficacy instruments to the ICF. This study was also conducted in accordance with the recommendations of Cochrane [18] and COSMIN [19,25] by two independent authors without any language or time restriction. Furthermore, the PRISMA 2020 Main Checklist was adopted, resulting in a more transparent, comprehensive, and accurate review (S1 Checklist). Although we have limited the inclusion criteria to self-efficacy instruments for coronary patients, the heterogeneity of the included instruments may have made the discussion of measurement properties more challenging. Another limitation may be related to the lack of patient and public involvement in its development.

Conclusion
Despite the large number of instruments assessing self-efficacy in individuals with CAD, none of them showed strong properties regarding the procedures adopted for their development and measurement validity. The best evidence level found was categorized as B, which indicates potential for recommendation. However, further clinimetric studies in accordance with COSMIN are required for evaluating self-efficacy in individuals with CAD. Regarding the linkage with the ICF, the CSES had the highest number of linkages with ICF codes in the categories of body functions and structures, activities and participation, and environmental factors. The CSES may therefore be considered the most comprehensive instrument assessed in this study, considering the importance of the ICF in standardizing clinical language.